580 research outputs found

    Rotate Intra Block Copy for Still Image Coding

    This paper proposes a method called rotate intra block copy, which extends the intra block copy technique by making the block matching process invariant to rotation. HEVC intra prediction plus rotate intra block copy gives an average 20% reduction in residual energy (i.e., prediction error) compared to HEVC intra prediction plus intra block copy. Because the motion vector correlation in rotate intra block copy differs from that in intra block copy, a new method of motion vector coding is presented. The impact of angular resolution on residual energy reduction is also evaluated. In a full codec pipeline, this reduction in residual energy translates into a BD-rate coding gain of 3.4% over HEVC intra prediction plus intra block copy for both screen content and camera-captured grayscale images. Samsung (Firm). Global Research Outreach Progra
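
    As a rough illustration of what "block matching invariant to rotation" can mean, the sketch below searches the rows above a target block for the candidate whose rotated copy minimizes the residual energy (sum of squared differences). It is a simplified sketch, not the paper's method: it assumes rotations in 90° steps only (via `numpy.rot90`), whereas the abstract evaluates angular resolution without stating the angles used. The function name, its parameters, and the search region are all hypothetical.

```python
import numpy as np

def rotate_ibc_search(image, top, left, size, use_rotation=True):
    """Hypothetical helper: return (residual_energy, src_row, src_col, quarter_turns).

    Returns None when there is no room above the target block for a candidate.
    """
    target = image[top:top + size, left:left + size].astype(np.float64)
    # Assumption: only 0/90/180/270 degree rotations are tried.
    turns = range(4) if use_rotation else range(1)
    best = None
    # Simplifying assumption: candidates lie entirely above the target block,
    # so they would already be reconstructed in raster-scan coding order.
    for r in range(0, top - size + 1):
        for c in range(0, image.shape[1] - size + 1):
            candidate = image[r:r + size, c:c + size].astype(np.float64)
            for k in turns:
                # Residual energy = sum of squared prediction errors.
                energy = float(np.sum((target - np.rot90(candidate, k)) ** 2))
                if best is None or energy < best[0]:
                    best = (energy, r, c, k)
    return best
```

    Calling the helper with `use_rotation=False` reduces it to a plain intra block copy search over the same region, so the two residual energies can be compared block by block.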

    ShenZhen transportation system (SZTS): a novel big data benchmark suite

    Data analytics is at the core of the supply chain for both products and services in modern economies and societies. Big data workloads, however, are placing unprecedented demands on computing technologies, calling for a deep understanding and characterization of these emerging workloads. In this paper, we propose ShenZhen Transportation System (SZTS), a novel big data Hadoop benchmark suite composed of real-life transportation analysis applications with real-life input data sets from Shenzhen, China. SZTS uniquely focuses on a specific, real-life application domain, whereas existing Hadoop benchmark suites such as HiBench and CloudRank-D consist of generic algorithms with synthetic inputs. We perform a cross-layer workload characterization at the microarchitecture level, the operating system (OS) level, and the job level, revealing unique characteristics of SZTS compared to existing Hadoop benchmarks as well as the general-purpose multi-core PARSEC benchmarks. We also study the sensitivity of workload behavior to input data size, and we propose a methodology for identifying representative input data sets.

    RAR-U-Net: a Residual Encoder to Attention Decoder by Residual Connections Framework for Spine Segmentation under Noisy Labels

    Segmentation algorithms for medical image volumes are widely studied for many clinical and research purposes. We propose a novel and efficient framework for medical image segmentation. The framework functions under a deep learning paradigm, incorporating four novel contributions. Firstly, a residual interconnection is explored in encoders at different scales. Secondly, the four copy-and-crop connections are replaced with residual-block-based concatenation to alleviate the disparity between encoders and decoders. Thirdly, convolutional attention modules for feature refinement are studied on the decoders at all scales. Finally, an adaptive denoising learning (ADL) strategy based on the training process from underfitting to overfitting is studied. Experimental results are reported on a publicly available benchmark database of spine CTs. Our segmentation framework achieves competitive performance with other state-of-the-art methods across a variety of evaluation measures.

    A 58.6mW Real-Time Programmable Object Detector with Multi-Scale Multi-Object Support Using Deformable Parts Model on 1920x1080 Video at 30fps

    This paper presents a programmable, energy-efficient and real-time object detection accelerator using deformable parts models (DPM), with 2× higher accuracy than traditional rigid body models. For detection with 8 deformable parts, three methods are used to address the high computational complexity: classification pruning for 33× fewer part classifications, vector quantization for 15× memory size reduction, and feature basis projection for 2× reduction of the cost of each classification. The chip is implemented in 65nm CMOS technology, and can process HD (1920×1080) images at 30fps without any off-chip storage while consuming only 58.6mW (0.94nJ/pixel, 1168 GOPS/W). The chip has two classification engines to simultaneously detect two different classes of objects. With a tested high throughput of 60fps, the classification engines can be time multiplexed to detect more than two object classes. It is energy scalable by changing the pruning factor or disabling the parts classification. United States. Defense Advanced Research Projects Agenc
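
    The per-pixel energy quoted above can be checked from the other figures in the abstract: power divided by pixels processed per second. The short sketch below does that arithmetic and also derives the throughput implied by the GOPS/W figure. The function names are illustrative only, and the implied GOPS value is a derived number, not one stated in the abstract.

```python
def energy_per_pixel_nj(power_mw, width, height, fps):
    """Energy per pixel in nanojoules, given power in milliwatts."""
    pixels_per_second = width * height * fps
    joules_per_pixel = (power_mw * 1e-3) / pixels_per_second
    return joules_per_pixel * 1e9

def implied_gops(power_mw, gops_per_watt):
    """Throughput in GOPS implied by an efficiency figure and a power draw."""
    return gops_per_watt * power_mw * 1e-3

# Figures taken from the abstract: 58.6 mW, 1920x1080 at 30 fps, 1168 GOPS/W.
print(round(energy_per_pixel_nj(58.6, 1920, 1080, 30), 2))  # -> 0.94
print(round(implied_gops(58.6, 1168), 1))                   # -> 68.4
```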

    A 58.6 mW 30 Frames/s Real-Time Programmable Multiobject Detection Accelerator With Deformable Parts Models on Full HD 1920×1080 Videos

    This paper presents a programmable, energy-efficient, and real-time object detection hardware accelerator for low power and high throughput applications using deformable parts models, with 2x higher detection accuracy than traditional rigid body models. Three methods are used to address the high computational complexity of detection with eight deformable parts: classification pruning for 33x fewer part classifications, vector quantization for 15x memory size reduction, and feature basis projection for 2x reduction in the cost of each classification. The chip was fabricated in a 65 nm CMOS technology, and can process full high definition 1920 × 1080 videos at 60 frames/s without any off-chip storage. The chip has two programmable classification engines (CEs) for multiobject detection. At 30 frames/s, the chip consumes only 58.6 mW (0.94 nJ/pixel, 1168 GOPS/W). At a higher throughput of 60 frames/s, the CEs can be time multiplexed to detect more than two object classes. The proposed accelerator enables object detection to be as energy-efficient as video compression, which is found in most cameras today. United States. Defense Advanced Research Projects Agency. Texas Instruments Incorporate

    Sparkle Vision: Seeing the World through Random Specular Microfacets

    In this paper, we study the problem of reproducing the world lighting from a single image of an object whose surface is covered with random specular microfacets. We show that such reflectors can be interpreted as a randomized mapping from the lighting to the image. Such specular objects have very different optical properties from both diffuse surfaces and smooth specular objects like metals, so we design a special imaging system to photograph them robustly and effectively. We present simple yet reliable algorithms to calibrate the proposed system and perform the inference. We conduct experiments to verify the correctness of our model assumptions and demonstrate the effectiveness of our pipeline.
